refactor: move prompt templates to files (#593) #596

Merged

Conversation

nautics889
Contributor

@nautics889 nautics889 commented Sep 26, 2023

This commit contains refactoring updates related to #593.
Brief summary:

  • rename Prompt -> AbstractPrompt
  • subclasses of AbstractPrompt must now implement the template property (the text attribute previously served the same purpose, but template is more accurate IMHO for this case)
  • add FileBasedPrompt (extends AbstractPrompt)
  • add directory assets/prompt-templates/
  • move content of CorrectErrorPrompt and GeneratePythonCodePrompt to assets/prompt-templates/correct_error_prompt.tmpl and assets/prompt-templates/generate_python_code.tmpl respectively
  • CorrectErrorPrompt and GeneratePythonCodePrompt now extend FileBasedPrompt
  • Update docs accordingly
  • Update tests accordingly
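
The hierarchy summarized above can be pictured roughly as follows. This is a minimal, self-contained sketch and not the code merged in this PR: the class bodies, the `_args` storage, the use of `str.format`, and the `GreetingPrompt` class with its temporary `greeting.tmpl` file are all assumptions made for illustration.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class AbstractPrompt(ABC):
    """Stand-in base class: subclasses must supply `template`."""

    def __init__(self, **kwargs):
        # Illustrative storage for template variables (assumed layout).
        self._args = dict(kwargs)

    @property
    @abstractmethod
    def template(self) -> str:
        """Return the raw template text."""

    def to_string(self) -> str:
        # Fill the {placeholders} in the template with the stored variables.
        return self.template.format(**self._args)


class FileBasedPrompt(AbstractPrompt):
    """Stand-in for a prompt whose template lives in a file."""

    _path_to_template: str

    @property
    def template(self) -> str:
        return Path(self._path_to_template).read_text(encoding="utf-8")


# Hypothetical template file, created on the fly so the sketch runs anywhere.
_tmpl = Path(tempfile.mkdtemp()) / "greeting.tmpl"
_tmpl.write_text("Hello, {name}!", encoding="utf-8")


class GreetingPrompt(FileBasedPrompt):
    _path_to_template = str(_tmpl)


rendered = GreetingPrompt(name="pandas").to_string()
print(rendered)
```

The point of the split is that a concrete prompt only has to name its template file, while the formatting logic stays in the base class.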


Summary by CodeRabbit

  • New Feature: Introduced AbstractPrompt and FileBasedPrompt classes to support a more flexible prompt system, including file-based prompts.
  • Refactor: Replaced all instances of the Prompt class with AbstractPrompt across the codebase for consistency.
  • Bug Fix: Addressed an issue with inconsistent line endings in test cases across different platforms.
  • New Feature: Added new exception classes TemplateFileNotFoundError and LibraryImportError for better error handling and reporting.
  • Test: Updated unit tests to reflect changes in the prompt system and added new tests for the AbstractPrompt class.
  • Chore: Updated documentation to reflect changes in the prompt system.

* (refactor): define base prompt class as an abstract
* (refactor): rename attribute `text` to `template` (since it is
  essentially a template)
* (feat): add FileBasedPrompt
* (style): resolve naming issues
* (tests): CRLF to LF on win platforms
* (fix): revert an unintended change (auto-refactoring renamed the
  prompt attribute in places where it should not have been
  renamed)
* (docs): update docstrings, update custom-prompts.md
* (chore): enhance type hinting
* (fix): add missing `FileBasedPrompt` in module __init__.py
@codecov-commenter

codecov-commenter commented Sep 27, 2023

Codecov Report

Merging #596 (3028b35) into main (57c5259) will decrease coverage by 0.25%.
Report is 6 commits behind head on main.
The diff coverage is 84.21%.


@@            Coverage Diff             @@
##             main     #596      +/-   ##
==========================================
- Coverage   83.90%   83.65%   -0.25%     
==========================================
  Files          55       55              
  Lines        2690     2717      +27     
==========================================
+ Hits         2257     2273      +16     
- Misses        433      444      +11     
Files                                      Coverage Δ
pandasai/__init__.py                       89.79% <100.00%> (ø)
pandasai/llm/azure_openai.py               89.36% <100.00%> (ø)
pandasai/llm/base.py                       88.97% <100.00%> (ø)
pandasai/llm/fake.py                       100.00% <100.00%> (ø)
pandasai/llm/langchain.py                  100.00% <100.00%> (ø)
pandasai/llm/openai.py                     100.00% <100.00%> (ø)
pandasai/prompts/__init__.py               100.00% <100.00%> (ø)
pandasai/prompts/correct_error_prompt.py   100.00% <100.00%> (ø)
pandasai/prompts/generate_python_code.py   100.00% <100.00%> (ø)
pandasai/smart_datalake/__init__.py        89.32% <100.00%> (-0.26%) ⬇️
... and 3 more

... and 2 files with indirect coverage changes


@nautics889 nautics889 marked this pull request as ready for review September 27, 2023 18:21
@coderabbitai
Contributor

coderabbitai bot commented Sep 27, 2023

Walkthrough

This pull request introduces a significant refactoring of the Prompt class into an abstract base class AbstractPrompt, and a file-based prompt FileBasedPrompt. It also includes changes to handle platform-specific line endings, and introduces new exception classes for better error handling.
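
As a rough illustration of the line-ending issue mentioned above: a template read on Windows may contain CRLF while the expected string in a test uses LF. One way to make such comparisons platform-independent is to normalize both sides first. The helper name `normalize_newlines` is hypothetical and is not taken from the PR.

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF and lone CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


windows_style = "line one\r\nline two\r\n"
unix_style = "line one\nline two\n"

# Both variants compare equal once normalized.
assert normalize_newlines(windows_style) == unix_style
# Text that already uses LF is left unchanged.
assert normalize_newlines(unix_style) == unix_style
```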

Changes

File(s): Summary
  • pandasai/prompts/base.py, pandasai/exceptions.py: Introduced AbstractPrompt and FileBasedPrompt classes. Added TemplateFileNotFoundError and LibraryImportError exceptions.
  • assets/prompt-templates/*.tmpl, pandasai/prompts/*.py: Refactored existing prompts to inherit from FileBasedPrompt and use template files.
  • pandasai/llm/*.py, pandasai/smart_datalake/__init__.py: Updated import statements and type hints to use AbstractPrompt.
  • tests/*.py: Updated tests to reflect changes in prompt classes and handle platform-specific line endings.

🐇💻

"In the land of code where the shadows lie,

An abstract class was born under the binary sky.

With templates and files, it began to thrive,

In the heart of the repo, it came alive.

Exception classes joined the dance,

Enhancing the code's robust stance.

Tests were updated, all in a row,

Celebrating the changes, in the soft glow."



Contributor

@coderabbitai coderabbitai bot left a comment


Review Status

Actionable comments generated: 1

Commits: files that changed from the base of the PR and between 95105eb and 6a4695d.
Files selected for processing (26)
  • assets/prompt-templates/correct_error_prompt.tmpl (1 hunks)
  • assets/prompt-templates/generate_python_code.tmpl (1 hunks)
  • docs/custom-prompts.md (2 hunks)
  • pandasai/__init__.py (2 hunks)
  • pandasai/llm/azure_openai.py (2 hunks)
  • pandasai/llm/base.py (5 hunks)
  • pandasai/llm/fake.py (2 hunks)
  • pandasai/llm/huggingface_text_gen.py (5 hunks)
  • pandasai/llm/langchain.py (2 hunks)
  • pandasai/llm/openai.py (2 hunks)
  • pandasai/prompts/__init__.py (1 hunks)
  • pandasai/prompts/base.py (2 hunks)
  • pandasai/prompts/correct_error_prompt.py (1 hunks)
  • pandasai/prompts/generate_python_code.py (1 hunks)
  • pandasai/smart_datalake/__init__.py (2 hunks)
  • tests/connectors/test_base.py (1 hunks)
  • tests/llms/test_base_hf.py (3 hunks)
  • tests/llms/test_google_palm.py (2 hunks)
  • tests/llms/test_huggingface_text_gen.py (4 hunks)
  • tests/llms/test_langchain_llm.py (2 hunks)
  • tests/llms/test_openai.py (2 hunks)
  • tests/prompts/test_base_prompt.py (1 hunks)
  • tests/prompts/test_correct_error_prompt.py (2 hunks)
  • tests/prompts/test_generate_python_code_prompt.py (3 hunks)
  • tests/test_smartdataframe.py (4 hunks)
  • tests/test_smartdatalake.py (2 hunks)
Files skipped from review due to trivial changes (12)
  • assets/prompt-templates/correct_error_prompt.tmpl
  • pandasai/llm/fake.py
  • pandasai/prompts/__init__.py
  • pandasai/prompts/correct_error_prompt.py
  • tests/connectors/test_base.py
  • tests/llms/test_huggingface_text_gen.py
  • tests/llms/test_langchain_llm.py
  • tests/llms/test_openai.py
  • tests/prompts/test_base_prompt.py
  • tests/prompts/test_correct_error_prompt.py
  • tests/prompts/test_generate_python_code_prompt.py
  • tests/test_smartdatalake.py
Additional comments (Suppressed): 39
assets/prompt-templates/generate_python_code.tmpl (1)
  • 1-30: The new hunk is a template file for generating Python code prompts. It provides placeholders for dataframes, conversation, default imports, and engine dataframe name. The template also includes instructions on how to update the initial Python code based on the last question in the conversation. This change enhances modularity by separating the prompt content from the logic of the GeneratePythonCodePrompt class.
pandasai/llm/langchain.py (2)
  • 1-4: The import statement has been updated to reflect the renaming of Prompt to AbstractPrompt. Ensure that the AbstractPrompt class provides all the necessary functionality previously provided by Prompt.

  • 16-18: The function signature for call has been updated to use AbstractPrompt instead of Prompt. Make sure that all calls to this function throughout the codebase have been updated to match the new signature.

pandasai/llm/openai.py (2)
  • 18-18: The import statement has been updated to reflect the renaming of Prompt to AbstractPrompt. Ensure that all references to Prompt in the codebase have been updated accordingly.

  • 88-88: The function signature for call has been updated to accept an AbstractPrompt instead of a Prompt. Make sure that all calls to this function throughout the codebase have been updated to match the new signature.

pandasai/llm/azure_openai.py (2)
  • 21-21: The import statement has been updated to reflect the renaming of Prompt to AbstractPrompt. Ensure that all references to Prompt in this file have been updated accordingly.

  • 108-108: The function signature for call() has been updated to accept an AbstractPrompt object instead of a Prompt object. Make sure that all calls to this function throughout the codebase pass an instance of a class that extends AbstractPrompt.

pandasai/__init__.py (2)
  • 43-43: The import statement has been updated to reflect the renaming of Prompt to AbstractPrompt. Ensure that all references to Prompt in the codebase have been updated accordingly.

  • 115-115: The type hint for non_default_prompts has been updated from Prompt to AbstractPrompt. This change is consistent with the renaming of the base class. Make sure that any custom prompts passed to this function are subclasses of AbstractPrompt.

pandasai/prompts/generate_python_code.py (3)
  • 36-37: The base class of GeneratePythonCodePrompt has changed from Prompt to FileBasedPrompt. This change is consistent with the PR summary and is part of the refactoring effort to improve modularity. Ensure that every prompt meant to load its template from a file extends FileBasedPrompt.

  • 42-43: The _path_to_template attribute points to the new template file for this prompt. Make sure that the specified file exists at the given path and contains the expected content.

  • 44-50: The __init__() method now sets two variables, default_import and engine_df_name, using the set_var() method before calling super().__init__(**kwargs). This is a change from the old hunk where these variables were set after the super call. This change should not cause any issues as long as set_var() does not rely on any attributes or methods that are initialized in the superclass's __init__() method.
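
The ordering concern in the last comment can be made concrete with a small sketch. The classes below are hypothetical stand-ins (this thread does not show how the real base class stores its variables); they only illustrate when calling `set_var()` before `super().__init__()` is safe and when it is not.

```python
class BasePrompt:
    """Hypothetical base whose storage only exists after __init__ runs."""

    def __init__(self, **kwargs):
        self._args = {}
        for key, value in kwargs.items():
            self.set_var(key, value)

    def set_var(self, name, value):
        self._args[name] = value


class EarlySetPrompt(BasePrompt):
    def __init__(self, **kwargs):
        # Fails: self._args has not been created yet.
        self.set_var("default_import", "pandas")
        super().__init__(**kwargs)


class LazyBasePrompt:
    """Variant that creates its storage on first use, so order is irrelevant."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            self.set_var(key, value)

    def set_var(self, name, value):
        if not hasattr(self, "_args"):
            self._args = {}
        self._args[name] = value


class SafeEarlySetPrompt(LazyBasePrompt):
    def __init__(self, **kwargs):
        self.set_var("default_import", "pandas")
        super().__init__(**kwargs)


try:
    EarlySetPrompt()
    early_failed = False
except AttributeError:
    early_failed = True

safe = SafeEarlySetPrompt(engine_df_name="DataFrame")
print(early_failed, safe._args)
```

If the storage were instead created inside the base `__init__` after the subclass has already set values, those early values could be silently overwritten, which is the other failure mode worth checking for.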

tests/llms/test_google_palm.py (2)
  • 6-12: The import statement has been updated to reflect the renaming of Prompt to AbstractPrompt. Ensure that all references to Prompt in the codebase have been updated accordingly.

  • 19-25: The MockPrompt class used in the prompt fixture has been renamed to MockAbstractPrompt, and its text attribute has been renamed to template. This change aligns with the renaming of Prompt to AbstractPrompt and the replacement of the text attribute with template in the base class. Ensure that all instances of MockPrompt and its text attribute have been updated throughout the test suite.

docs/custom-prompts.md (3)
  • 16-34: The new hunk demonstrates how to create a custom prompt by subclassing AbstractPrompt and overriding the template property. This is a good example of using inheritance and encapsulation, two principles of object-oriented programming. The code looks correct and should work as expected if the my_custom_value argument is provided when creating an instance of MyCustomPrompt.

  • 44-59: This hunk shows how to use FileBasedPrompt to create a custom prompt with the template content stored in a file. It's important to ensure that the path specified in _path_to_template is correct and the file exists at that location. If the file doesn't exist or can't be read, it could cause runtime errors.

  • 66-80: This hunk updates the previous example of creating a custom prompt to use the new AbstractPrompt class and the template attribute instead of text. The change is consistent with the rest of the PR and should not cause any issues as long as the dfs and conversation variables are available in the context where the template is rendered.
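
The pattern described for the first documentation example might look like the sketch below. The `AbstractPrompt` here is a local stand-in, not the pandasai class; its `_args` storage and `str.format` rendering are assumptions, and the template wording is invented for illustration.

```python
from abc import ABC, abstractmethod


class AbstractPrompt(ABC):
    """Local stand-in for the base class; not the pandasai implementation."""

    def __init__(self, **kwargs):
        self._args = dict(kwargs)

    @property
    @abstractmethod
    def template(self) -> str:
        """Return the raw template text."""

    def to_string(self) -> str:
        return self.template.format(**self._args)


class MyCustomPrompt(AbstractPrompt):
    """Custom prompt that overrides the `template` property."""

    @property
    def template(self) -> str:
        return "Answer using this context: {my_custom_value}"


prompt = MyCustomPrompt(my_custom_value="sales are reported in EUR")
print(prompt.to_string())
```

Because `template` is a property rather than a plain attribute, a subclass is free to build the text dynamically or, as in the file-based variant, read it from disk on access.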

tests/test_smartdataframe.py (4)
  • 20-20: The import statement has been updated to reflect the renaming of Prompt to AbstractPrompt. Ensure that all references to Prompt in the codebase have been updated accordingly.

  • 380-384: The custom prompt class CustomPrompt now extends AbstractPrompt instead of Prompt, and uses template instead of text. This change aligns with the refactoring of the prompt system. However, ensure that the template string is correctly formatted and all variables within {} are available in the context when this prompt is used.

  • 401-407: The ReplacementPrompt class now extends AbstractPrompt and overrides the template property instead of defining a text attribute. This change is consistent with the refactoring of the prompt system. However, ensure that the return value of the template property is correctly formatted and all variables within {} are available in the context when this prompt is used.

  • 515-517: The custom_prompts dictionary now uses an instance of GeneratePythonCodePrompt instead of Prompt. This change is consistent with the introduction of the new FileBasedPrompt subclass and the refactoring of the prompt system. Ensure that the GeneratePythonCodePrompt class is correctly implemented and its template file contains valid content.
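
Several comments above ask that every variable inside `{}` be available when the template is rendered. One way to check this ahead of time, sketched here with the standard library's `string.Formatter`, is to list the placeholder names and compare them with the variables on hand. The helper name `missing_variables` and the sample template are hypothetical.

```python
from string import Formatter


def missing_variables(template: str, available: dict) -> set:
    """Return placeholder names used in `template` but absent from `available`."""
    names = {
        field_name
        for _, field_name, _, _ in Formatter().parse(template)
        if field_name  # skips literal-only segments and positional "{}"
    }
    return names - set(available)


template = "Dataframes:\n{dfs}\n\nConversation:\n{conversation}"
print(missing_variables(template, {"dfs": "..."}))
```

This simple version treats compound fields such as `{df.name}` as a single name, so it is only a first-pass check.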

pandasai/smart_datalake/__init__.py (2)
  • 35-37: The import statement for Prompt has been replaced with AbstractPrompt. This change aligns with the refactoring of the prompt system where Prompt is renamed to AbstractPrompt. Ensure that all references to Prompt in the codebase have been updated to AbstractPrompt.

  • 204-223: The function _get_prompt() now returns an instance of AbstractPrompt instead of Prompt. The argument default_prompt also expects a type of AbstractPrompt instead of Prompt. This change is consistent with the renaming of Prompt to AbstractPrompt. Make sure that all calls to this function are updated accordingly.

tests/llms/test_base_hf.py (3)
  • 7-7: The import statement has been updated to reflect the renaming of Prompt to AbstractPrompt. Ensure that all references to Prompt in the codebase have been updated accordingly.

  • 19-22: The fixture prompt now uses a mock class MockAbstractPrompt that extends AbstractPrompt instead of Prompt. The attribute text has been replaced with template, reflecting its role as a template for generating prompts. This change aligns with the refactoring of the prompt system in the pandas-ai project.

  • 65-68: Similar to the previous comment, the test test_call_removes_original_prompt now uses a mock class MockAbstractPrompt that extends AbstractPrompt instead of Prompt. The attribute text has been replaced with template.

pandasai/llm/base.py (5)
  • 32-32: The import statement has been updated to reflect the renaming of Prompt to AbstractPrompt. Ensure that all references to Prompt in the codebase have been updated accordingly.

  • 123-123: The function signature for call() has been updated to accept an AbstractPrompt instead of a Prompt. Make sure all calls to this function throughout the codebase have been updated to match the new signature.

  • 137-137: The function signature for generate_code() has been updated to accept an AbstractPrompt instead of a Prompt. Make sure all calls to this function throughout the codebase have been updated to match the new signature.

  • 337-337: The function signature for call() in the HuggingFaceLLM class has been updated to accept an AbstractPrompt instead of a Prompt. Make sure all calls to this function throughout the codebase have been updated to match the new signature.

  • 432-432: The function signature for call() in the BaseGoogle class has been updated to accept an AbstractPrompt instead of a Prompt. Make sure all calls to this function throughout the codebase have been updated to match the new signature.

pandasai/llm/huggingface_text_gen.py (5)
  • 2-8: The import statement for AbstractPrompt is correct and aligns with the changes described in the PR summary. Ensure that all references to Prompt have been updated to AbstractPrompt throughout the codebase.

  • 14-20: No significant changes were made in this hunk, only a minor change in the formatting of the temperature value from 1E-3 to 1e-3. Both are equivalent in Python.

  • 29-35: No significant changes were made in this hunk, only a minor change in the formatting of the for loop. The parentheses around (key, val) in the old hunk are not necessary in Python and have been correctly removed in the new hunk.

  • 63-71: The function signature of call() has been updated to accept an AbstractPrompt instead of a Prompt. This aligns with the changes described in the PR summary. Also, the attribute text has been replaced with template in line 70, which is consistent with the PR summary. Make sure all calls to this function have been updated accordingly.

- for chunk in self.client.generate_stream(prompt, **params):
-    completion += chunk.text
+ for chunk in self.client.generate_stream(prompt, **params):
+    completion += chunk.template
  • 76-81: No significant changes were made in this hunk. The code remains the same as before.
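
The two formatting-only observations above (for lines 14-20 and 29-35) can be checked directly. The `params` dictionary below is a made-up example, not the one from the file under review.

```python
# The exponent marker in a float literal is case-insensitive.
assert 1E-3 == 1e-3 == 0.001

params = {"temperature": 1e-3, "top_k": 50}

# Tuple unpacking in a for loop works with or without parentheses.
with_parens = []
for (key, val) in params.items():
    with_parens.append((key, val))

without_parens = []
for key, val in params.items():
    without_parens.append((key, val))

assert with_parens == without_parens
```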
pandasai/prompts/base.py (3)
  • 1-16: The Prompt class has been renamed to AbstractPrompt and now inherits from ABC (the abstract base class helper from Python's abc module). This is a good practice for classes that are meant to be used as base classes, as it makes it clear that they should not be instantiated directly. The text attribute has been removed, so subclasses need to provide their own implementation of the prompt template.

  • 48-65: The to_string method no longer raises a MethodNotImplementedError if the template attribute is None. Instead, it seems to rely on the abstractmethod decorator on the template property to ensure that subclasses provide an implementation. This is a cleaner approach, as it leverages Python's built-in support for abstract methods.

  • 68-90: A new subclass FileBasedPrompt has been introduced, which reads the prompt template from a file. The path to the template file is stored in the _path_to_template attribute, and the template property reads the file content when accessed. This is a good design choice, as it allows for more flexible and maintainable prompts by separating the template content from the code. However, there is no error handling for potential issues with reading the file, such as permission errors or encoding issues. Consider adding a try-except block around the file reading operation to handle these cases gracefully.
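
To illustrate the point about relying on the abstractmethod decorator: with an abstract `template` property, Python itself refuses to instantiate a subclass that does not implement it. The sketch below uses simplified stand-in classes, not the pandasai source.

```python
from abc import ABC, abstractmethod


class AbstractPrompt(ABC):
    @property
    @abstractmethod
    def template(self) -> str:
        """Subclasses must return the template text."""


class IncompletePrompt(AbstractPrompt):
    pass  # forgets to define `template`


class CompletePrompt(AbstractPrompt):
    @property
    def template(self) -> str:
        return "Hello, {name}!"


try:
    IncompletePrompt()
    error_message = None
except TypeError as exc:
    # Raised at instantiation time, before any method is called.
    error_message = str(exc)

print(error_message)
print(CompletePrompt().template)
```

The failure happens when the object is created rather than when `to_string` is first called, which is why an explicit "not implemented" check inside `to_string` becomes unnecessary.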


pandasai/prompts/base.py (outdated, resolved)
* (refactor): change exception handling when reading template content
  from a file
Contributor

@coderabbitai coderabbitai bot left a comment


Review Status

Actionable comments generated: 1

Commits: files that changed from the base of the PR and between 6a4695d and 3bfd998.
Files selected for processing (2)
  • pandasai/exceptions.py (1 hunks)
  • pandasai/prompts/base.py (2 hunks)
Additional comments (Suppressed): 4
pandasai/prompts/base.py (4)
  • 1-17: The Prompt class has been renamed to AbstractPrompt and now inherits from ABC (the abstract base class helper from Python's abc module). This change enforces that any class inheriting from AbstractPrompt must implement the template property, making it more explicit what methods or properties a child class should implement. The text attribute has been removed and replaced by the abstract template property.

  • 54-55: The template property is declared as an abstract method, which means any class that extends AbstractPrompt must provide an implementation for this property. This is a good practice as it makes the interface of the base class clear and ensures that all subclasses adhere to the same interface.

  • 62-66: The to_string method now uses the template property instead of the text attribute to format the string representation of the prompt. This change aligns with the removal of the text attribute and the introduction of the template property. Ensure that all subclasses of AbstractPrompt correctly implement the template property.

  • 69-95: A new subclass FileBasedPrompt has been introduced, which extends AbstractPrompt. This class is designed for prompts that read template content from files. It introduces a _path_to_template attribute and overrides the template property to read the template content from the file specified by _path_to_template. Error handling has been added to raise a TemplateFileNotFoundError if the template file cannot be found, and a RuntimeError if there's an issue reading the file. This is a good practice as it separates the concerns of different types of prompts and improves error handling.
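
The error handling described in the last comment could be sketched as a standalone helper. The exception class below is a local stand-in whose base class and constructor are assumptions (the real one is defined in pandasai/exceptions.py), and `read_template` is a hypothetical function name, not part of the PR.

```python
import tempfile
from pathlib import Path


class TemplateFileNotFoundError(Exception):
    """Local stand-in; base class and signature are assumed."""


def read_template(path: str) -> str:
    """Read a template file, translating low-level errors."""
    try:
        return Path(path).read_text(encoding="utf-8")
    except FileNotFoundError as exc:
        raise TemplateFileNotFoundError(f"Template not found: {path}") from exc
    except (OSError, UnicodeDecodeError) as exc:
        # Permission problems, directories, and encoding errors end up here.
        raise RuntimeError(f"Could not read template {path}: {exc}") from exc


# Temporary files so the sketch is runnable as-is.
workdir = Path(tempfile.mkdtemp())
good = workdir / "ok.tmpl"
good.write_text("Fix this code: {code}", encoding="utf-8")
bad = workdir / "bad.tmpl"
bad.write_bytes(b"\xff\xfe\xfa")  # not valid UTF-8

print(read_template(str(good)))
```

Catching `FileNotFoundError` before the broader `OSError` matters, since the former is a subclass of the latter and would otherwise be reported as a generic read failure.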

pandasai/exceptions.py (outdated, resolved)
* (docs): fix docstring for custom exception
Contributor

@coderabbitai coderabbitai bot left a comment


Review Status

Actionable comments generated: 0

Commits: files that changed from the base of the PR and between 3bfd998 and 3028b35.
Files selected for processing (1)
  • pandasai/exceptions.py (1 hunks)
Files skipped from review due to trivial changes (1)
  • pandasai/exceptions.py

@gventuri
Collaborator

Great job @nautics889, way better and more maintainable! Thanks a lot, merging!

@gventuri gventuri merged commit b480df1 into Sinaptik-AI:main Sep 28, 2023
Development

Successfully merging this pull request may close these issues.

Move prompt templates to separate files
3 participants